Efficient MPI Support for Advanced Hybrid Programming Models
Abstract
The number of multithreaded Message Passing Interface (MPI) implementations and applications is increasing rapidly. We discuss how multithreaded applications can receive messages of unknown size. As is well known, combining MPI_Probe and MPI_Recv is not thread-safe, but many assume that trivial workarounds exist. We discuss those workarounds and show how they fail in practice by either limiting the available parallelism unnecessarily, consuming resources in a non-scalable way, or promoting global deadlocks. In this light, we propose two fundamentally different efficient approaches to enable thread-safe messaging in MPI-2.2: fine-grained locking and matching outside of MPI. Our approaches provide thread-safe probe and receive functionality, but both have deficiencies, including performance limitations and programming complexity, that could be avoided if MPI offered a thread-safe (stateless) interface to MPI_Probe. We propose such an extension for the upcoming MPI-3 standard, provide a reference implementation, and demonstrate significant performance benefits.
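The race described in the abstract can be illustrated with a small toy model. The sketch below is not MPI code: `Mailbox`, `probe`, `recv`, and `matched_probe` are hypothetical names invented for illustration, and the thread interleaving is replayed sequentially so the outcome is deterministic. It assumes that a probe only reports the size of the head message while a receive takes whatever is at the head at that moment. The `matched_probe` variant is loosely modeled on the idea of a stateless probe that removes the message and returns a handle to it, in the spirit of the matched probe (MPI_Mprobe/MPI_Mrecv) found in MPI-3.

```python
from collections import deque


class Mailbox:
    """Toy stand-in for one process's incoming message queue (not real MPI)."""

    def __init__(self, messages):
        self.queue = deque(messages)

    def probe(self):
        # Probe-style call: report the size of the head message but leave it queued.
        return len(self.queue[0])

    def recv(self, bufsize):
        # Receive-style call: take whichever message is at the head *now*.
        msg = self.queue.popleft()
        if len(msg) > bufsize:
            raise BufferError(
                f"{len(msg)}-byte message does not fit in {bufsize}-byte buffer"
            )
        return msg

    def matched_probe(self):
        # Hypothetical stateless probe: dequeue the message and hand it back as a
        # handle, so no other caller can match it afterwards.
        msg = self.queue.popleft()
        return len(msg), msg


def racy_interleaving():
    """Replay the schedule: A probes, B probes, B receives, A receives."""
    box = Mailbox([b"abcd", b"abcdefgh"])
    size_a = box.probe()      # thread A sees the 4-byte message
    size_b = box.probe()      # thread B sees the same 4-byte message
    box.recv(size_b)          # thread B takes it
    try:
        box.recv(size_a)      # thread A now gets the 8-byte message instead
    except BufferError:
        return "overflow"
    return "ok"


def matched_interleaving():
    """Same schedule with the matched probe: each caller owns what it probed."""
    box = Mailbox([b"abcd", b"abcdefgh"])
    size_a, msg_a = box.matched_probe()
    size_b, msg_b = box.matched_probe()
    return [(size_a, msg_a), (size_b, msg_b)]


if __name__ == "__main__":
    print(racy_interleaving())
    print(matched_interleaving())
```

In the first schedule, thread A sizes its buffer from a message that thread B then consumes, so A's receive hits a larger message. In the second, probing and claiming the message are a single step, so the size each caller learns always belongs to the message it ends up with. A real multithreaded version would also need the dequeue itself to be atomic.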
Similar resources
Toward Efficient Support for Multithreaded MPI Communication
To make the most effective use of parallel machines that are being built out of increasingly large multicore chips, researchers are exploring the use of programming models comprising a mixture of MPI and threads. Such hybrid models require efficient support from an MPI implementation for MPI messages sent from multiple threads simultaneously. In this paper, we explore the issues involved in des...
Fine-Grained Multithreading Support for Hybrid Threaded MPI Programming
As high-end computing systems continue to grow in scale, recent advances in multi- and many-core architectures have pushed such growth toward denser architectures, that is, more processing elements per physical node, rather than more physical nodes themselves. Although a large number of scientific applications have relied so far on an MPI-everywhere model for programming high-end parallel sy...
Advanced Hybrid MPI/OpenMP Parallelization Paradigms for Nested Loop Algorithms onto Clusters of SMPs
The parallelization process of nested-loop algorithms onto popular multi-level parallel architectures, such as clusters of SMPs, is not a trivial issue, since the existence of data dependencies in the algorithm imposes severe restrictions on the task decomposition to be applied. In this paper we propose three techniques for the parallelization of such algorithms, namely pure MPI parallelization,...
Communication and Optimization Aspects of Parallel Programming Models on Hybrid Architectures
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distributed memory parallelization on the node interconnect with the shared memory parallelization inside of each node. The hybrid MPI+OpenMP programming model is compared with pure MPI, compiler-based parallelization, and other parallel programming models on hybrid architectures. The paper focuses on b...
Hybrid MPI/OpenMP Application on Multicore Architectures: The Case of Profit-Sharing Life Insurance Policies Valuation
The DISAR (Dynamic Investment Strategy with Accounting Rules) system, an Asset-Liability Management software for monitoring portfolios of life insurance policies, has been proven to be extremely efficient on a grid of conventional computers. However, when executed on multicore architectures, it is fundamental to face new challenges, due to the machine characteristics, in order to imp...